1. Introduction to genetic testing and the underlying technology

Personal genetic testing is transforming biology and medicine now that people have access to their own genetic instructions. However, each benefit, such as personalized medicine, assistance in criminal investigations and expanded research, is accompanied by a drawback. Personalized medicine is a privilege that seems to go mainly to those in higher socioeconomic classes, criminal investigations are riddled with privacy and consent obstacles, and research has taken advantage of minorities or has catered largely to people of European ancestry. There is also serious concern about privacy and the possibility of such personal data being leaked and falling into the wrong hands.

DNA polymorphisms, the most common of which are single-nucleotide polymorphisms (SNPs), can be detected in genomes in various ways. In past decades, methods such as Southern blots, PCR and hybridization techniques using microarray chips have been used to analyze genomes. More recently, DNA-based molecular markers have been a breakthrough technology for detecting SNPs, as these markers can easily identify particular DNA sequences.

2. My genealogy and family history

I was very lucky to have well-documented information about my family history on my paternal side, with almost no questions left unanswered. Our documents included birth locations and dates, death locations and dates, occupations, relocations, and any achievements or major life events. My mother’s side, however, is a complete mystery beyond immediate family.

Figures: Genealogy Tree; Family Tree

3. DNA matches and relatives in the databases

I would expect to share around 50% of my DNA with each of my parents, 25% with each of my grandparents, 12.5% with each great grandparent and so on. However, it is important to distinguish between genealogical ancestors and genetic ancestors, as the latter are the ones from whom I actually inherited some DNA. This, of course, refers to autosomal DNA, as the sex chromosomes and mitochondrial DNA are passed down more directly. The number of genealogical ancestors doubles every generation, so it keeps increasing exponentially, whereas after about eight generations back the number of genetic ancestors is expected to increase only roughly linearly. So starting around the eighth generation, I would begin to have ancestors with whom I share no DNA.
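
The halving rule above can be tabulated in a few lines of R. This is only a sketch of the idealized expectation: it assumes exactly 2^g distinct genealogical ancestors at generation g (no pedigree collapse) and an average autosomal share of 0.5^g per ancestor. The variable names are illustrative.

```r
# Idealized expectations per generation back (1 = parents, 2 = grandparents, ...).
# Assumes no pedigree collapse, so every genealogical ancestor is a distinct person.
generations <- 1:10

ancestry <- data.frame(
  generation = generations,
  genealogical_ancestors = 2^generations,           # doubles every generation
  expected_percent_shared = 100 * 0.5^generations   # average autosomal share per ancestor
)

print(ancestry)
```

Row 3 reproduces the 12.5% per great grandparent quoted above. By generation 8 the table lists 256 genealogical ancestors with an average expectation of about 0.39% each, which is small enough that some of them may have contributed no DNA at all.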

Figure: DNA Relatives

I share around the expected amount of DNA with my aunt (25% expected, 23.34% observed) and with my first cousin once removed (6.25% expected, 5.17% observed). I share 2.5% and 2.11% with two of my second cousins, against an expected 3.13%. I share 1.78%, 1.58% and 1.14% with three of my third cousins, against an expected 0.78%.
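
The comparison above can be laid out as a small table. The sketch below uses the observed percentages from my matches and the standard average expectations (1/4 for an aunt, 1/16 for a first cousin once removed, 1/32 for a second cousin, 1/128 for a third cousin). The data frame and column names are illustrative.

```r
# Expected average autosomal sharing by relationship, paired with the
# observed percentages reported above.
matches <- data.frame(
  relationship = c("aunt", "first cousin once removed",
                   "second cousin", "second cousin",
                   "third cousin", "third cousin", "third cousin"),
  expected_percent = 100 * c(1/4, 1/16, 1/32, 1/32, 1/128, 1/128, 1/128),
  observed_percent = c(23.34, 5.17, 2.5, 2.11, 1.78, 1.58, 1.14)
)

# A ratio above 1 means more sharing than the average expectation
matches$observed_over_expected <-
  round(matches$observed_percent / matches$expected_percent, 2)

print(matches)
```

In this table the closer relatives come out slightly below 1, while all three third cousins come out above 1.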

4. Ethnicity estimates

My ethnicity results were exactly what I predicted. I already knew that I am 50% Ashkenazi Jewish, as my paternal side of the family is 100% Ashkenazi Jewish. My maternal side could potentially have held some surprises, but I got the results I expected. My 47% Eastern European is all from my mother, who believes that her family is Don Cossack. I even expected the 2% East Asian and Native American result, as Russian history, and thus genealogy, was heavily influenced by the Mongol Empire, which ruled over Russia in the 13th and 14th centuries. The German and French influence is most likely from my paternal side, as many Jews originally stayed in Germany during the Jewish migration through Europe. There was also 0.2% Central Asian (Kazakhstan, Uzbekistan, Turkmenistan, etc.) trace DNA as well as 0.2% North African and West Asian DNA. The Central Asian DNA either also came from Mongol rule over Russia or, as a more modern explanation, from the USSR, which included the Central Asian region. I am not sure where the 0.4% trace North African DNA comes in. I could further test my theories by looking at DNA similarities between me and some of the people 23andMe listed as possible relatives, which I did. By looking at my aunt on my paternal side, I could confirm that my Central Asian DNA most likely came from my mother, as my aunt had no Central Asian DNA. She did, however, have 1% North African and West Asian DNA, which means that those results are most likely from my Jewish ancestors.

Figures: Ethnicity Percent; Ethnicity Map

5. Older family history based on mitochondrial and Y chromosome results

The maternal and paternal haplogroups offer approximate ancestry information from ten thousand to a hundred thousand years ago. This is because mitochondrial DNA and most of the Y chromosome do not recombine, so they are passed down largely unchanged along the maternal and paternal lines. Therefore, the mutations that do accumulate are informative and can be traced back hundreds of generations. Through family members who have also taken a 23andMe test, I was able to gather more information about my mitochondrial and Y chromosome results.

Mother: Mitochondrial: H1u Y chromosome: ?

Father: Mitochondrial: W3 Y chromosome: E-L29

My mitochondrial haplogroup is H1u, and if I had a Y chromosome, my haplogroup would be E-L29. After further research, I discovered that the H1u lineage is proposed to have split off from other H groups around modern-day Azerbaijan, in the general Caucasus area. This lines up with my mother’s Don Cossack ancestry, as Don Cossacks are believed to have originated in the North Caucasus. The most common maternal haplogroup for Ashkenazi Jews is K, so it is interesting that my father’s maternal haplogroup is W, which is most common in Pakistan and Northern India. I found some research suggesting that W3 originated in the Middle East but spread to Europe around 15,000 years ago and now spans regions from Russia to North Africa, the Caucasus, the Near East, Mongolia and the Indian subcontinent. E-L29 also originated in the Middle East, about 4,000 years ago, and is extremely common in Ashkenazi Jews.

6. Medically important genotypes and 7. Interesting genotypes

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3     ✓ purrr   0.3.4
## ✓ tibble  3.1.0     ✓ dplyr   1.0.5
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(DT)
library(gwascat)
## gwascat loaded.  Use makeCurrentGwascat() to extract current image.
##  from EBI.  The data folder of this package has some legacy extracts.
# Load my 23andMe raw data (lines starting with '#' are treated as comments)
mySNPs <- read_tsv(
  "data/genome_A_G_v5_Full_20210322120133.txt",
  comment = "#",
  col_types = cols(
    rsid = col_character(),
    chromosome = col_factor(),
    position = col_integer(),
    genotype = col_factor()
  )
)
updated_gwas_data <- as.data.frame(makeCurrentGwascat())
## running read.delim on http://www.ebi.ac.uk/gwas/api/search/downloads/alternative...
## formatting gwaswloc instance...
## NOTE: input data had non-ASCII characters replaced by '*'.
## Warning in which(!is.na(as.numeric(df$CHR_POS))): NAs introduced by coercion
## Warning in gwdf2GRanges(tab, extractDate = as.character(Sys.Date())): NAs
## introduced by coercion
## done.
max(updated_gwas_data$DATE.ADDED.TO.CATALOG)
## [1] "2021-04-16"
last_update <- max(updated_gwas_data$DATE.ADDED.TO.CATALOG)

filter(updated_gwas_data, DATE.ADDED.TO.CATALOG == last_update) %>% select(STUDY) %>% distinct()
##                                                                                                                                                                            STUDY
## 1                                                                    Low-frequency variation near common germline susceptibility loci are associated with risk of Ewing sarcoma.
## 2 The Genetics of Circulating Resistin Level, A Biomarker for Cardiovascular Diseases, Is Informed by Mendelian Randomization and the Unique Characteristics of African Genomes.
## 3                                                                                              Genetic Architecture of Abdominal Aortic Aneurysm in the Million Veteran Program.
## 4                                         A genome-wide association study on fish consumption in a Japanese population-the Japan Multi-Institutional Collaborative Cohort study.
## 5                                                        GWAS of peptic ulcer disease implicates Helicobacter pylori infection, other gastrointestinal disorders and depression.
## 6                                                             Genetic basis of lacunar stroke: a pooled analysis of individual patient data and genome-wide association studies.
filter(updated_gwas_data, DATE.ADDED.TO.CATALOG == last_update) %>% select(LINK) %>% distinct()
##                                   LINK
## 1 www.ncbi.nlm.nih.gov/pubmed/32881892
## 2 www.ncbi.nlm.nih.gov/pubmed/32876488
## 3 www.ncbi.nlm.nih.gov/pubmed/32981348
## 4 www.ncbi.nlm.nih.gov/pubmed/32895509
## 5 www.ncbi.nlm.nih.gov/pubmed/33608531
## 6 www.ncbi.nlm.nih.gov/pubmed/33773637
# Keep only my SNPs that also appear in the GWAS catalog
mySNPs_gwas_table <- inner_join(mySNPs, updated_gwas_data, by = c("rsid" = "SNPS"))
# The risk allele is the last character of the catalog field
mySNPs_gwas_table$risk_allele_clean <- str_sub(mySNPs_gwas_table$STRONGEST.SNP.RISK.ALLELE, -1)
# Split my two-letter genotype into its two alleles
mySNPs_gwas_table$my_allele_1 <- str_sub(mySNPs_gwas_table$genotype, 1, 1)
mySNPs_gwas_table$my_allele_2 <- str_sub(mySNPs_gwas_table$genotype, 2, 2)
# Count how many of my two alleles match the risk allele (0, 1 or 2)
mySNPs_gwas_table$have_risk_allele_count <- if_else(mySNPs_gwas_table$my_allele_1 == mySNPs_gwas_table$risk_allele_clean, 1, 0) + if_else(mySNPs_gwas_table$my_allele_2 == mySNPs_gwas_table$risk_allele_clean, 1, 0)
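
To make the risk allele counting step above easier to follow, here is the same logic applied to three toy rows. The rsids and genotypes are hypothetical placeholders, not values from my data, and the example assumes the catalog field has the form "rsid-allele".

```r
library(dplyr)
library(stringr)

# Toy rows (hypothetical values, not taken from my data)
toy <- tibble(
  rsid = c("rs0000001", "rs0000002", "rs0000003"),
  genotype = c("AG", "GG", "CT"),
  STRONGEST.SNP.RISK.ALLELE = c("rs0000001-G", "rs0000002-G", "rs0000003-A")
)

toy <- toy %>%
  mutate(
    risk_allele_clean = str_sub(STRONGEST.SNP.RISK.ALLELE, -1),  # last character, e.g. "G"
    my_allele_1 = str_sub(genotype, 1, 1),
    my_allele_2 = str_sub(genotype, 2, 2),
    # 0, 1 or 2 copies of the risk allele
    have_risk_allele_count =
      if_else(my_allele_1 == risk_allele_clean, 1, 0) +
      if_else(my_allele_2 == risk_allele_clean, 1, 0)
  )

toy %>% select(rsid, genotype, risk_allele_clean, have_risk_allele_count)
```

The heterozygous AG row counts one copy, the GG row two, and the CT row none. A catalog entry whose last character is not a nucleotide can never match, so it always counts as zero.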

There are three medical concerns that I wanted to investigate with my SNPs: asthma, type-2 diabetes, and Crohn’s disease along with associated IBD or IBS SNPs.

I have a family history of type-2 diabetes. I wanted to look more into it to see what SNPs I have associated with diabetes risk.

filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype, freq = RISK.ALLELE.FREQUENCY) %>% 
 filter(str_detect(tolower(DISEASE.TRAIT), "diabetes")) %>%
 distinct()
## # A tibble: 593 x 5
##    rsid     DISEASE.TRAIT                      risk_allele your_geneotype   freq
##    <chr>    <chr>                              <chr>       <fct>           <dbl>
##  1 rs12116… Type 2 diabetes                    G           AG              0.354
##  2 rs17106… Type 2 diabetes                    G           GG             NA    
##  3 rs12140… Type 2 diabetes                    G           GG              0.905
##  4 rs12140… Type 2 diabetes                    G           GG              0.913
##  5 rs602633 Coronary heart disease x type 2 d… T           GT              0.216
##  6 rs22824… Type 2 diabetes                    G           GG              0.715
##  7 rs6032   Macrovascular complications in ty… T           TT              0.37 
##  8 rs40774… Cystic fibrosis-related diabetes   A           AA              0.580
##  9 rs30245… Type 1 diabetes                    G           GG             NA    
## 10 rs30245… Type 1 diabetes                    G           GG              0.84 
## # … with 583 more rows
filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype, freq = RISK.ALLELE.FREQUENCY) %>% 
 filter(freq >= 0.5) %>%
 filter(str_detect(tolower(DISEASE.TRAIT), "diabetes")) %>%
 distinct()
## # A tibble: 307 x 5
##    rsid       DISEASE.TRAIT                    risk_allele your_geneotype  freq
##    <chr>      <chr>                            <chr>       <fct>          <dbl>
##  1 rs12140153 Type 2 diabetes                  G           GG             0.905
##  2 rs12140153 Type 2 diabetes                  G           GG             0.913
##  3 rs2282456  Type 2 diabetes                  G           GG             0.715
##  4 rs4077468  Cystic fibrosis-related diabetes A           AA             0.580
##  5 rs3024505  Type 1 diabetes                  G           GG             0.84 
##  6 rs340874   Type 2 diabetes                  C           CT             0.556
##  7 rs340874   Type 2 diabetes                  C           CT             0.531
##  8 rs340874   Type 2 diabetes                  C           CT             0.553
##  9 rs2867125  Type 2 diabetes                  C           CC             0.83 
## 10 rs35913461 Type 2 diabetes                  C           CC             0.829
## # … with 297 more rows
filter(mySNPs, rsid == "rs12255372")
## # A tibble: 1 x 4
##   rsid       chromosome  position genotype
##   <chr>      <fct>          <int> <fct>   
## 1 rs12255372 10         114808902 GG

Not linked to type-2 diabetes or breast cancer

filter(mySNPs, rsid == "rs4402960")
## # A tibble: 1 x 4
##   rsid      chromosome  position genotype
##   <chr>     <fct>          <int> <fct>   
## 1 rs4402960 3          185511687 GG

Not linked to type-2 diabetes

filter(mySNPs, rsid == "rs7754840")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs7754840 6          20661250 CG

Not linked to type-2 diabetes

filter(mySNPs, rsid == "rs12255372")
## # A tibble: 1 x 4
##   rsid       chromosome  position genotype
##   <chr>      <fct>          <int> <fct>   
## 1 rs12255372 10         114808902 GG

no increased risk of T2D

Carrying two copies of a common variant of TCF7L2 doubles your chances of developing diabetes and puts you in a similar risk category to being clinically obese.

filter(mySNPs, rsid == "rs7903146")
## # A tibble: 1 x 4
##   rsid      chromosome  position genotype
##   <chr>     <fct>          <int> <fct>   
## 1 rs7903146 10         114758349 CC

Normal (lower) risk of Type 2 Diabetes and Gestational Diabetes.

Although I do have plenty of risk alleles associated with T2D, and just over half of the matching catalog entries (307 of 593) have a risk allele frequency of at least 0.5, I believe that if I maintain a healthy lifestyle I can, hopefully, avoid developing T2D. I also have a few alleles associated with lower risk of T2D, which supports my healthy lifestyle hypothesis. The majority of my risk alleles were not found on SNPedia or did not have a published magnitude.
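
The row counts in the diabetes tables above (593 and 307) count catalog entries rather than SNPs, because the same rsid can be reported by several studies (rs340874 appears three times in the output). A minimal sketch of counting distinct SNPs instead, using a few rows copied from the output above; `diabetes_hits` is an illustrative name and the full table is not reproduced.

```r
library(dplyr)

# A few rows copied from the filtered output above
diabetes_hits <- tibble(
  rsid = c("rs12140153", "rs12140153", "rs2282456",
           "rs340874", "rs340874", "rs340874"),
  DISEASE.TRAIT = "Type 2 diabetes",
  freq = c(0.905, 0.913, 0.715, 0.556, 0.531, 0.553)
)

nrow(diabetes_hits)             # number of catalog entries
n_distinct(diabetes_hits$rsid)  # number of distinct SNPs

# One row per SNP, keeping the range of reported risk allele frequencies
per_snp <- diabetes_hits %>%
  group_by(rsid) %>%
  summarise(entries = n(), min_freq = min(freq), max_freq = max(freq),
            .groups = "drop")

per_snp
```

Here six catalog entries collapse to three distinct SNPs, so a per-SNP summary gives a more conservative picture than the raw row count.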

filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype, freq = RISK.ALLELE.FREQUENCY) %>% 
 filter(str_detect(tolower(DISEASE.TRAIT), "celiac")) %>%
 distinct()
## # A tibble: 18 x 5
##    rsid      DISEASE.TRAIT                     risk_allele your_geneotype   freq
##    <chr>     <chr>                             <chr>       <fct>           <dbl>
##  1 rs130034… Celiac disease                    G           AG              0.4  
##  2 rs130034… Celiac disease                    G           AG              0.388
##  3 rs101882… Crohn's disease and celiac disea… C           CT             NA    
##  4 rs7574865 Celiac disease or Rheumatoid art… T           GT             NA    
##  5 rs4678523 Celiac disease                    C           CT              0.313
##  6 rs117121… Celiac disease                    G           GT             NA    
##  7 rs6822844 Celiac disease                    G           GG              0.82 
##  8 rs424232  Celiac disease                    C           CC             NA    
##  9 rs108064… Celiac disease                    A           AC              0.4  
## 10 rs2041570 Refractory celiac disease type II A           AG              0.41 
## 11 rs119840… Celiac disease or Rheumatoid art… G           AG             NA    
## 12 rs1953126 Celiac disease or Rheumatoid art… T           CT             NA    
## 13 rs1250552 Celiac disease                    A           AG              0.53 
## 14 rs7104791 Celiac disease                    T           CT             NA    
## 15 rs3184504 Celiac disease                    C           CT              0.488
## 16 rs653178  Celiac disease or Rheumatoid art… C           CT             NA    
## 17 rs2664156 Celiac disease                    C           CC             NA    
## 18 rs112032… Celiac disease or Rheumatoid art… A           AA             NA

I carry one of two SNPs associated with increased risk of Crohn’s disease.

filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype, freq = RISK.ALLELE.FREQUENCY) %>% 
 filter(str_detect(tolower(DISEASE.TRAIT), "colorectal")) %>%
 distinct()
## # A tibble: 123 x 5
##    rsid    DISEASE.TRAIT                       risk_allele your_geneotype   freq
##    <chr>   <chr>                               <chr>       <fct>           <dbl>
##  1 rs7264… Colorectal cancer                   T           TT             NA    
##  2 rs7542… Colorectal cancer                   C           CC              0.273
##  3 rs1092… Metastasis in stage I-III microsat… T           CT             NA    
##  4 rs6691… Colorectal cancer                   T           GT             NA    
##  5 rs1701… Colorectal cancer or advanced aden… G           AG              0.209
##  6 rs6687… Colorectal cancer                   G           AG              0.24 
##  7 rs6687… Colorectal cancer                   G           AG             NA    
##  8 rs6687… Colorectal cancer                   G           AG              0.222
##  9 rs8850… Progression free survival in metas… A           AG             NA    
## 10 rs2163… Colorectal cancer                   G           GG              0.747
## # … with 113 more rows
filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype, freq = RISK.ALLELE.FREQUENCY) %>% 
 filter(freq >= 0.5) %>%
 filter(str_detect(tolower(DISEASE.TRAIT), "colorectal")) %>%
 distinct()
## # A tibble: 60 x 5
##    rsid       DISEASE.TRAIT                     risk_allele your_geneotype  freq
##    <chr>      <chr>                             <chr>       <fct>          <dbl>
##  1 rs2163735  Colorectal cancer                 G           GG             0.747
##  2 rs1162658… Colorectal cancer                 G           GG             0.923
##  3 rs651907   Colorectal cancer                 C           CC             0.54 
##  4 rs13086367 Colorectal cancer or advanced ad… A           AA             0.526
##  5 rs10049390 Colorectal cancer or advanced ad… A           AG             0.723
##  6 rs17035289 Colorectal cancer                 T           CT             0.83 
##  7 rs2735940  Colorectal cancer                 G           GG             0.61 
##  8 rs9271695  Colorectal cancer or advanced ad… G           GG             0.804
##  9 rs4946260  Colorectal cancer                 T           CT             0.53 
## 10 rs3801081  Colorectal cancer                 G           GG             0.68 
## # … with 50 more rows
filter(mySNPs, rsid == "rs16892766")
## # A tibble: 1 x 4
##   rsid       chromosome  position genotype
##   <chr>      <fct>          <int> <fct>   
## 1 rs16892766 8          117630683 AA
filter(mySNPs, rsid == "rs4779584")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs4779584 15         32994756 CC
filter(mySNPs, rsid == "rs58920878")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>

no increased risk of colorectal cancer from rs16892766 or rs4779584; rs58920878 is not in my data

filter(mySNPs, rsid == "rs4939827")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs4939827 18         46453463 CC

0.73x decreased risk for colorectal cancer

While I do not believe I have a family history of celiac or Crohn’s disease, both are very common in Jewish populations. I do have some risk alleles associated with increased risk of celiac and Crohn’s disease, and the allele frequencies are fairly high, but I also have alleles associated with decreased risk of celiac or Crohn’s disease. Furthermore, SNP rs7574865 has a magnitude of 2.5, which SNPedia categorizes as something to keep in mind but not worry about, and SNP rs3184504 has a magnitude of 1.4, which SNPedia labels as not too exciting. I found that a majority of my risk SNPs had no published magnitude associated with them.

Regarding my risk of colorectal cancer, I do have plenty of risk alleles with a risk allele frequency greater than 0.5. However, none had a magnitude large enough to cause concern.

I have been “diagnosed” with asthma, but I have never experienced an asthma attack or trouble breathing. Therefore, I wanted to see if I have any SNPs associated with asthma.

filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype, freq = RISK.ALLELE.FREQUENCY) %>% 
 filter(str_detect(tolower(DISEASE.TRAIT), "asthma")) %>%
 distinct()
## # A tibble: 183 x 5
##    rsid     DISEASE.TRAIT                      risk_allele your_geneotype   freq
##    <chr>    <chr>                              <chr>       <fct>           <dbl>
##  1 rs734999 Asthma                             C           CT              0.51 
##  2 rs301806 Allergic disease (asthma, hay fev… T           CT              0.54 
##  3 rs12932… Asthma                             G           GG              0.87 
##  4 rs22285… Asthma                             T           GT              0.63 
##  5 rs22285… Asthma                             T           GT              0.631
##  6 rs22285… Asthma                             T           GT              0.643
##  7 rs48456… Asthma                             G           GG             NA    
##  8 rs41292… Asthma                             T           TT              0.37 
##  9 rs41292… Asthma                             T           TT              0.4  
## 10 rs903361 Asthma                             A           AG              0.659
## # … with 173 more rows
filter(mySNPs, rsid == "rs1695")
## # A tibble: 1 x 4
##   rsid   chromosome position genotype
##   <chr>  <fct>         <int> <fct>   
## 1 rs1695 11         67352689 AA

normal asthma risk in certain populations

filter(mySNPs, rsid == "rs2303067")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>

rs2303067, an asthma and atopic dermatitis SNP, is not in my data

filter(mySNPs, rsid == "rs4794067")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs4794067 17         45808828 CC

2.1x risk for aspirin-induced asthma (AIA), but possibly lower risk of lupus and intractable Graves’ disease, with a magnitude of 2, which is something to keep in mind but not necessarily worry about. I was not aware that aspirin-induced asthma was a condition, so it is good to know that I have an increased risk of AIA.

The following SNPs are all associated with increased asthma risk if exposed to smoke.

filter(mySNPs, rsid == "rs2305480")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs2305480 17         38062196 GG

~3x increased asthma risk if exposed to smoke with a magnitude of 2, which is something to keep in mind but not necessarily worry about.

filter(mySNPs, rsid == "rs4795400")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>

rs4795400 is not in my data
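
A lookup that returns zero rows, like the one above, means the rsid does not occur in the raw file at all, which is different from being genotyped without the risk allele. The sketch below makes that distinction explicit. `lookup_snp` is a hypothetical helper, and `toy_snps` is a stand-in for `mySNPs` built from two genotypes shown above.

```r
library(dplyr)

# Toy stand-in for mySNPs, using two genotypes shown above
toy_snps <- tibble(
  rsid = c("rs2305480", "rs4794067"),
  genotype = c("GG", "CC")
)

# Hypothetical helper: report the genotype, or say the SNP is absent from the file
lookup_snp <- function(snps, id) {
  hit <- filter(snps, rsid == id)
  if (nrow(hit) == 0) {
    return(paste(id, "not in file (not genotyped or listed under another ID)"))
  }
  paste(id, "genotype:", as.character(hit$genotype[1]))
}

lookup_snp(toy_snps, "rs2305480")
lookup_snp(toy_snps, "rs4795400")
```

The first call reports the GG genotype, while the second reports that rs4795400 is missing, which leaves the risk for that SNP unknown rather than absent.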

Overall, a lot of my asthma alleles were not found on SNPedia or had no published magnitude; however, rs4129267 has a magnitude of 1.4, which is fairly low and does not personally cause me concern (even though I do have two copies of the risk allele). I also have SNPs such as rs1837253, which is associated with a 0.84x decreased risk of asthma with a magnitude of 2.2 (and I have multiple copies of this allele).

filter(mySNPs, rsid == "rs4680")
## # A tibble: 1 x 4
##   rsid   chromosome position genotype
##   <chr>  <fct>         <int> <fct>   
## 1 rs4680 22         19951271 GG

Warrior: Val, less exploratory, higher COMT enzymatic activity, therefore lower dopamine levels; higher pain threshold, better stress resiliency, albeit with a modest reduction in executive cognition performance under most conditions. Magnitude 2.5.

filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype, freq = RISK.ALLELE.FREQUENCY) %>% 
 filter(str_detect(tolower(DISEASE.TRAIT), "nicotine")) %>%
 distinct()
## # A tibble: 12 x 5
##    rsid      DISEASE.TRAIT                     risk_allele your_geneotype   freq
##    <chr>     <chr>                             <chr>       <fct>           <dbl>
##  1 rs1060061 Nicotine dependence               T           CT             NA    
##  2 rs4668485 Nicotine dependence symptom count T           CT             NA    
##  3 rs9379896 Nicotine dependence symptom count C           CC             NA    
##  4 rs623929… Nicotine dependence               T           TT             NA    
##  5 rs4132568 Nicotine glucouronidation         A           AA             NA    
##  6 rs117633… Nicotine dependence symptom count A           AG             NA    
##  7 rs4285401 Nicotine use                      A           AG              0.447
##  8 rs7385760 Nicotine dependence symptom count T           CT             NA    
##  9 rs108286… Nicotine dependence symptom count T           CT             NA    
## 10 rs169699… Fagerstr**m test for nicotine de… G           AG             NA    
## 11 rs8075300 Nicotine dependence symptom count C           CC              0.465
## 12 rs2836823 Nicotine dependence               T           CT              0.4
filter(mySNPs, rsid == "rs3750344 ")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>
filter(mySNPs, rsid == "rs1051730 ")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>

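Both lookups returned no rows, but the rsid strings I typed end with a space, so they could never match. This result therefore says nothing about whether I carry these nicotine dependence alleles.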
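
Exact matching with `==` is sensitive to stray whitespace: the two rsid strings in the lookups above end with a space, so they return zero rows whatever the file contains. A minimal sketch of the effect and of trimming the query with `str_trim()` follows. `toy_snps` is a stand-in for `mySNPs`, and its genotype is a made-up placeholder.

```r
library(dplyr)
library(stringr)

# Toy stand-in for mySNPs; the genotype is a made-up placeholder
toy_snps <- tibble(rsid = "rs1051730", genotype = "AG")

query <- "rs1051730 "  # trailing space, as typed in the lookups above

untrimmed <- filter(toy_snps, rsid == query)            # the space prevents a match
trimmed <- filter(toy_snps, rsid == str_trim(query))    # matches once trimmed

nrow(untrimmed)
nrow(trimmed)
```

Trimming the query, or simply retyping the ID without the space, is needed before an empty result can be read as "not in my data".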

I was surprised to discover I had multiple risk alleles for nicotine dependence. This does not really apply to me because I have never smoked, owing to my asthma diagnosis. The SNP rs16969968, for which my genotype has a magnitude of 2.5, is interestingly associated with lower cocaine dependence as well as higher nicotine dependence. I did not realize that a variant that raises dependence on one “drug” could lower dependence on another.

Drug Metabolism

filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype, freq = RISK.ALLELE.FREQUENCY) %>% 
 filter(str_detect(tolower(DISEASE.TRAIT), "drug")) %>%
 distinct()
## # A tibble: 24 x 5
##    rsid    DISEASE.TRAIT                       risk_allele your_geneotype   freq
##    <chr>   <chr>                               <chr>       <fct>           <dbl>
##  1 rs7800… Medication use (drugs used in diab… T           CT              0.384
##  2 rs6755… QT interval (drug interaction)      T           CT              0.32 
##  3 rs1495… Cough in response to angiotensin-c… C           CT             NA    
##  4 rs2844… Drug-induced Stevens-Johnson syndr… C           CC              0.62 
##  5 rs3130… Drug-induced Stevens-Johnson syndr… C           CC              0.69 
##  6 rs3130… Drug-induced Stevens-Johnson syndr… G           GG              0.74 
##  7 rs3094… Drug-induced Stevens-Johnson syndr… A           AA              0.63 
##  8 rs4235… Adverse response to chemotherapy (… A           AG              0.74 
##  9 rs2505… Adverse response to chemotherapy (… G           AG              0.398
## 10 rs8491… Medication use (drugs used in diab… T           CT              0.411
## # … with 14 more rows
filter(mySNPs, rsid == "rs4986893")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs4986893 10         96540410 GG
filter(mySNPs, rsid == "rs28399504")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs28399504 10         96522463 AA
filter(mySNPs, rsid == "rs41291556")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs41291556 10         96535173 TT

normal metabolizer of several commonly prescribed drugs

filter(mySNPs, rsid == "rs12248560")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs12248560 10         96521657 CT

ultra fast metabolizer of proton pump inhibitors and benefit from tamoxifen treatment; drug metabolism effects; also 0.77x decreased breast cancer risk. Magnitude 2.

filter(mySNPs, rsid == "rs8099917")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs8099917 19         39743165 GT

Moderately lower odds of responding to PEG-IFNalpha/RBV treatment (Hepatitis C treatments). Magnitude 2.

filter(mySNPs, rsid == "rs1057910")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs1057910 10         96741053 AC

average 40% reduction in warfarin metabolism (1/2 SNPs). Magnitude 2.5.

filter(mySNPs, rsid == "rs1800460")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs1800460 6          18139228 CT

impaired capability of detoxifying byproducts of certain drugs (antineoplastic and immunosuppressant). Magnitude 3.

filter(mySNPs, rsid == "rs1800462")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs1800462 6          18143955 CC

incapable of detoxifying certain drugs (antineoplastic and immunosuppressant). Magnitude 3.5.

filter(mySNPs, rsid == "rs1142345")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs1142345 6          18130918 CT

impaired drug metabolism (antineoplastic and immunosuppressant). Magnitude 3.

The three SNPs associated with antineoplastic and immunosuppressant drug metabolism, all of which show impaired or lost function in my results, also have the highest magnitudes. A magnitude of 3.5 is labelled by SNPedia as something worth my time, meaning I should bring it up with my doctor.

filter(mySNPs, rsid == "rs11212617")
## # A tibble: 1 x 4
##   rsid       chromosome  position genotype
##   <chr>      <fct>          <int> <fct>   
## 1 rs11212617 11         108283161 AC

Somewhat increased likelihood of treatment success with metformin (which helps with diabetes, of which I have an increased chance). Magnitude 1.5.

filter(mySNPs, rsid == "rs2395029")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs2395029 6          31431780 TT

no increased risk of drug-induced liver injury when prescribed flucloxacillin

Overall, I was shocked to find out that I have three SNPs associated with antineoplastic and immunosuppressant metabolism, two with impaired function and one with no function. This is not something I had ever thought about, but I am glad to be aware of it for the future.
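
Several related SNPs can be checked in one call with `%in%` instead of one `filter()` per rsid, and `setdiff()` then lists any IDs that are missing from the file. A sketch under the assumption that `toy_snps` stands in for `mySNPs`; its three rows are copied from the output above, and the last ID in `snps_of_interest` is a made-up placeholder.

```r
library(dplyr)

# Toy stand-in for mySNPs with three rows shown above
toy_snps <- tibble(
  rsid = c("rs1800460", "rs1800462", "rs1142345"),
  chromosome = "6",
  position = c(18139228L, 18143955L, 18130918L),
  genotype = c("CT", "CC", "CT")
)

# The last ID is a made-up placeholder to show how missing SNPs are reported
snps_of_interest <- c("rs1800460", "rs1800462", "rs1142345", "rs0000000")

found <- filter(toy_snps, rsid %in% snps_of_interest)
missing <- setdiff(snps_of_interest, found$rsid)

found
missing
```

Keeping the found and missing lists side by side makes it harder to mistake an ungenotyped SNP for a normal result.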

Ashkenazi related alleles

Ashkenazi Jews are linked with a higher frequency of hereditary genetic disorders, so I was curious (and a bit nervous) to see if I am a carrier for any hereditary diseases linked to Ashkenazi Jews.

filter(mySNPs, rsid == "rs11209026")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs11209026 1          67705958 GG

higher risk for certain autoimmune diseases. Magnitude 1.1.

filter(mySNPs, rsid == "rs386833395")
## # A tibble: 1 x 4
##   rsid        chromosome position genotype
##   <chr>       <fct>         <int> <fct>   
## 1 rs386833395 17         41276045 II
filter(mySNPs, rsid == "rs80357906")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs80357906 17         41209083 DD

no BRCA1 variants

filter(mySNPs, rsid == "rs80359550")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs80359550 13         32914438 II

no BRCA2 variant

not a carrier for cystic fibrosis

filter(mySNPs, rsid == "rs121965064")
## # A tibble: 1 x 4
##   rsid        chromosome  position genotype
##   <chr>       <fct>          <int> <fct>   
## 1 rs121965064 4          187201412 TT
filter(mySNPs, rsid == "rs373297713")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>
filter(mySNPs, rsid == "rs121965063")
## # A tibble: 1 x 4
##   rsid        chromosome  position genotype
##   <chr>       <fct>          <int> <fct>   
## 1 rs121965063 4          187195347 GG

not a carrier of hemophilia C (1/23 Ashkenazi are carriers)

filter(mySNPs, rsid == "rs111033171")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>
filter(mySNPs, rsid == "rs137853022")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>
filter(mySNPs, rsid == "rs28939712")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>

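Familial dysautonomia carrier status cannot be determined: none of these three SNPs appear in my raw data (each query returns zero rows), so the test did not cover them.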
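Several of the lookups above return zero rows, which means the SNP is absent from the raw data rather than that the genotype is normal. The following is a minimal sketch of a way to make that distinction explicit when checking several rsids at once. The helper name `lookup_snps` and the toy table are hypothetical, and treating "--" as a no-call is an assumption about the raw data format.

```r
library(dplyr)
library(tibble)

# Hypothetical helper (not part of the original analysis): returns one row per
# requested rsid, including rsids that are missing from the raw data.
lookup_snps <- function(snps, rsids) {
  tibble(rsid = rsids) %>%
    left_join(mutate(snps, genotype = as.character(genotype)), by = "rsid") %>%
    mutate(status = case_when(
      is.na(genotype)  ~ "not genotyped",  # rsid absent from the file
      genotype == "--" ~ "no call",        # assumption: "--" marks a failed call
      TRUE             ~ "genotyped"
    ))
}

# Toy stand-in for mySNPs, built from two rows shown above
toy_snps <- tibble(
  rsid       = c("rs121965064", "rs121965063"),
  chromosome = factor(c("4", "4")),
  position   = c(187201412L, 187195347L),
  genotype   = factor(c("TT", "GG"))
)

result <- lookup_snps(toy_snps, c("rs121965064", "rs373297713", "rs121965063"))
print(result)
```

With the real data, the call would presumably be `lookup_snps(mySNPs, c(...))`; only rows with status "genotyped" support a carrier or non-carrier statement.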

Checking for random SNPs

filter(mySNPs, rsid == "rs333")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>

Resistance to HIV cannot be determined: rs333 (the CCR5 delta-32 deletion) is not in my raw data.

filter(mySNPs, rsid == "rs662799")
## # A tibble: 1 x 4
##   rsid     chromosome  position genotype
##   <chr>    <fct>          <int> <fct>   
## 1 rs662799 11         116663707 AG

1.4x higher early heart attack risk; less weight gain on high fat diets. Magnitude 2.0.

filter(mySNPs, rsid == "rs7495174")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs7495174 15         28344238 AA

blue/gray eyes more likely

filter(mySNPs, rsid == "rs12913832")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs12913832 15         28365618 GG

blue eye color, 99% of the time

filter(mySNPs, rsid == "rs1799971")
## # A tibble: 1 x 4
##   rsid      chromosome  position genotype
##   <chr>     <fct>          <int> <fct>   
## 1 rs1799971 6          154360797 AA

No stronger alcohol cravings

filter(mySNPs, rsid == "rs4988235")
## # A tibble: 1 x 4
##   rsid      chromosome  position genotype
##   <chr>     <fct>          <int> <fct>   
## 1 rs4988235 2          136608646 AA

Can digest lactose

filter(mySNPs, rsid == "rs590787")
## # A tibble: 1 x 4
##   rsid     chromosome position genotype
##   <chr>    <fct>         <int> <fct>   
## 1 rs590787 1          25629943 AG

Rh+. I knew I was type A; now I know I'm A+.

filter(mySNPs, rsid == "rs4675690")
## # A tibble: 1 x 4
##   rsid      chromosome  position genotype
##   <chr>     <fct>          <int> <fct>   
## 1 rs4675690 2          208507807 TT

show less disgust

filter(mySNPs, rsid == "rs1015362")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs1015362 20         32738612 CT

2-4x higher risk of sun sensitivity if part of risk haplotype. Magnitude 2.

filter(mySNPs, rsid == "rs4911414")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs4911414 20         32729444 GT

2-4x higher risk of sun sensitivity if part of risk haplotype. Magnitude 2.

filter(mySNPs, rsid == "rs12821256")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs12821256 12         89328335 TT

no additional likelihood of blonde hair

filter(mySNPs, rsid == "rs12203592")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs12203592 6            396321 CT

likely presence of freckles, brown hair and high sensitivity of skin to sun exposure.

filter(mySNPs, rsid == "rs35264875")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs35264875 11         68846399 AT

one blonde variant

filter(mySNPs, rsid == "rs12896399")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs12896399 14         92773663 TT

Lighter hair color & blue eyes more likely

filter(mySNPs, rsid == "rs1042522")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs1042522 17          7579472 CC

Live 3 years longer. Chemotherapy is more effective. Magnitude 3, pretty exciting.

filter(mySNPs, rsid == "rs6968865")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs6968865 7          17287269 TT

Associated with (slightly) increased coffee consumption

filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(rsid, your_genotype = genotype, strongest_risk_allele = risk_allele_clean, DISEASE.TRAIT, STUDY) 
## # A tibble: 23,142 x 5
##    rsid    your_genotype strongest_risk_a… DISEASE.TRAIT    STUDY               
##    <chr>   <fct>         <chr>             <chr>            <chr>               
##  1 rs1126… CT            C                 IgG glycosylati… Loci associated wit…
##  2 rs2803… AA            A                 Body mass index  Meta-analysis of ge…
##  3 rs2803… AA            A                 Body mass index  Meta-analysis of ge…
##  4 rs4252… CT            T                 Height           Hundreds of variant…
##  5 rs1079… CT            C                 Ulcerative coli… Host-microbe intera…
##  6 rs7349… CT            C                 Ulcerative coli… Meta-analysis ident…
##  7 rs7349… CT            C                 Asthma           Genome-wide analysi…
##  8 rs3748… AG            A                 Primary scleros… Genome-wide associa…
##  9 rs3748… AG            A                 Primary scleros… Dense genotyping of…
## 10 rs3890… CT            T                 Rheumatoid arth… Common variants at …
## # … with 23,132 more rows
datatable(
 filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(rsid, your_genotype = genotype, strongest_risk_allele = risk_allele_clean, DISEASE.TRAIT, STUDY )
)
## Warning in instance$preRenderHook(instance): It seems your data is too big
## for client-side DataTables. You may consider server-side processing: https://
## rstudio.github.io/DT/server.html
datatable(
 filter(mySNPs_gwas_table,have_risk_allele_count > 0 & (str_detect(tolower(INITIAL.SAMPLE.SIZE), "european") | str_detect(tolower(REPLICATION.SAMPLE.SIZE), "european")) & (RISK.ALLELE.FREQUENCY > 0 & !is.na(RISK.ALLELE.FREQUENCY))) %>%
 arrange(RISK.ALLELE.FREQUENCY) %>%
 select(rsid, your_genotype = genotype, DISEASE.TRAIT, INITIAL.SAMPLE.SIZE,RISK.ALLELE.FREQUENCY)
 )
## Warning in instance$preRenderHook(instance): It seems your data is too big
## for client-side DataTables. You may consider server-side processing: https://
## rstudio.github.io/DT/server.html

Overall summary of all the potential risk alleles I have
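The DataTables warnings above appear because the full table, tens of thousands of rows, is embedded in the page. One possible workaround, sketched here on made-up data, is to trim the table to the columns and rows of interest before rendering. The helper `trim_for_display` and the 1,000-row cap are illustrative assumptions, not part of the original analysis.

```r
library(dplyr)

# Hypothetical helper: keep a few columns and at most `max_rows` rows
trim_for_display <- function(tbl, max_rows = 1000) {
  tbl %>%
    select(rsid, genotype, DISEASE.TRAIT) %>%
    slice_head(n = max_rows)
}

# Made-up stand-in for mySNPs_gwas_table
toy_gwas <- tibble::tibble(
  rsid          = paste0("rs", 1:5000),
  genotype      = rep(c("CT", "AA"), 2500),
  DISEASE.TRAIT = rep(c("Height", "Body mass index"), 2500),
  STUDY         = "placeholder"
)

small <- trim_for_display(toy_gwas, max_rows = 1000)

# Build the widget only if DT is installed; the smaller table is what gets embedded
if (requireNamespace("DT", quietly = TRUE)) {
  widget <- DT::datatable(small)
}
```

Filtering more tightly first (for example on a single trait) would serve the same purpose while keeping every relevant row.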

datatable(
 filter(mySNPs_gwas_table,have_risk_allele_count > 0 & (str_detect(tolower(INITIAL.SAMPLE.SIZE), "european") | str_detect(tolower(REPLICATION.SAMPLE.SIZE), "european")) & (RISK.ALLELE.FREQUENCY > 0.9 & !is.na(RISK.ALLELE.FREQUENCY))) %>%
 arrange(RISK.ALLELE.FREQUENCY) %>%
 select(rsid, your_genotype = genotype, DISEASE.TRAIT, INITIAL.SAMPLE.SIZE,RISK.ALLELE.FREQUENCY)
 )

The increased risk for intraocular pressure is very interesting because I am extremely susceptible to migraines and headaches in general.

I ha

trait_entry_count <- group_by(mySNPs_gwas_table, DISEASE.TRAIT) %>%
 filter(have_risk_allele_count >= 1) %>%
 summarise(count_of_entries = n())

ggplot(filter(trait_entry_count, count_of_entries > 100), aes(x = reorder(DISEASE.TRAIT, count_of_entries, sum), y = count_of_entries)) +
 geom_col() +
 coord_flip() +
 theme_bw() +
 labs(title = "Which traits I have the risk allele for\nthat have over 100 entries in the GWAS database?", y = "Count of entries", x = "Trait")

# Summarise proportion of SNPs for a given trait where you have a risk allele
trait_snp_proportion <-  filter(mySNPs_gwas_table, risk_allele_clean %in% c("C" ,"A", "G", "T") & my_allele_1 %in% c("C" ,"A", "G", "T") & my_allele_2 %in% c("C" ,"A", "G", "T") ) %>%
mutate(you_have_risk_allele = if_else(have_risk_allele_count >= 1, 1, 0)) %>%
 group_by(DISEASE.TRAIT, you_have_risk_allele) %>%
 summarise(count_of_snps = n_distinct(rsid)) %>%
 mutate(total_snps_for_trait = sum(count_of_snps), proportion_of_snps_for_trait = count_of_snps / sum(count_of_snps) * 100) %>%
 filter(you_have_risk_allele == 1) %>%
 arrange(desc(proportion_of_snps_for_trait)) %>%
 ungroup()
## `summarise()` has grouped output by 'DISEASE.TRAIT'. You can override using the `.groups` argument.
trait_study_count <- filter(mySNPs_gwas_table, risk_allele_clean %in% c("C" ,"A", "G", "T") & my_allele_1 %in% c("C" ,"A", "G", "T") & my_allele_2 %in% c("C" ,"A", "G", "T") ) %>%
 group_by(DISEASE.TRAIT) %>%
 summarise(count_of_studies = n_distinct(PUBMEDID), mean_risk_allele_freq = mean(RISK.ALLELE.FREQUENCY))


trait_snp_proportion <- inner_join(trait_snp_proportion, trait_study_count, by = "DISEASE.TRAIT")

ggplot(filter(trait_snp_proportion, count_of_studies > 1 & proportion_of_snps_for_trait > 70), aes(x = reorder(DISEASE.TRAIT, proportion_of_snps_for_trait, sum), y = proportion_of_snps_for_trait, fill = mean_risk_allele_freq)) +
 geom_col() +
 coord_flip() +
 theme_bw() + 
 labs(title = "Traits where I have more than 70% of the risk\nalleles studied, where > 1 study was involved", 
 y = "% of SNPs with risk allele", x = "Trait", fill = "Mean risk allele frequency") 

datatable(trait_snp_proportion)
datatable(
 filter(mySNPs_gwas_table,have_risk_allele_count > 0 & (str_detect(tolower(INITIAL.SAMPLE.SIZE), "european") | str_detect(tolower(REPLICATION.SAMPLE.SIZE), "european")) & (RISK.ALLELE.FREQUENCY > 0. & !is.na(RISK.ALLELE.FREQUENCY))) %>%
 arrange(RISK.ALLELE.FREQUENCY) %>%
 select(rsid, your_genotype = genotype, DISEASE.TRAIT, INITIAL.SAMPLE.SIZE,RISK.ALLELE.FREQUENCY)
 )
## Warning in instance$preRenderHook(instance): It seems your data is too big
## for client-side DataTables. You may consider server-side processing: https://
## rstudio.github.io/DT/server.html
datatable(
 filter(mySNPs_gwas_table, have_risk_allele_count == 2) %>%
 select(rsid, your_genotype = genotype, strongest_risk_allele = risk_allele_clean, DISEASE.TRAIT, STUDY )
)
## Warning in instance$preRenderHook(instance): It seems your data is too big
## for client-side DataTables. You may consider server-side processing: https://
## rstudio.github.io/DT/server.html

SNPs where I have both risk alleles
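The filters above rely on `have_risk_allele_count`, whose derivation is not shown in this part of the report. As a rough sketch, assuming it simply counts how many of the two genotype letters match the study's risk allele, it could be computed as follows. The rsids and values in the toy table are made up.

```r
library(dplyr)
library(stringr)

# Made-up example rows; column names follow the ones used in this report
toy <- tibble::tibble(
  rsid              = c("rsA", "rsB", "rsC"),
  genotype          = c("CT", "AA", "GG"),
  risk_allele_clean = c("C", "A", "T")
)

counted <- toy %>%
  mutate(
    my_allele_1 = str_sub(genotype, 1, 1),  # first letter of the genotype
    my_allele_2 = str_sub(genotype, 2, 2),  # second letter of the genotype
    # Assumed definition: number of my two alleles equal to the risk allele (0, 1 or 2)
    have_risk_allele_count = (my_allele_1 == risk_allele_clean) +
                             (my_allele_2 == risk_allele_clean)
  )

print(counted)
```

Under this definition a count of 1 means heterozygous for the risk allele and a count of 2 means homozygous, which is the case the last table filters for.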

8. Recommendations for next steps

Being Ashkenazi, I already have an increased susceptibility to certain gastrointestinal conditions, and my test results reaffirmed that I should visit a gastroenterologist and most likely get further testing done, especially considering that I have undiagnosed gastrointestinal issues. I was not aware that I was predisposed to type 2 diabetes; this does not require mentioning to my medical provider, but I should take it into consideration in my lifestyle choices. Furthermore, what I discovered regarding drug and medication metabolism shocked me: I was not at all aware that I had so many SNPs associated with impaired drug metabolism. That is something I will most definitely tell my medical provider about. Additionally, purely out of curiosity, I would like to have my mother genetically tested because her family history is such a mystery.